Suppose that three kinds of quantum systems are given in some unknown states $|f\rangle^{\otimes N}$, $|g_1\rangle^{\otimes K}$,
and $|g_2\rangle^{\otimes K}$, and we want to decide which template state $|g_1\rangle$ or $|g_2\rangle$, each representing the feature
of the pattern class $C_1$ or $C_2$, respectively, is closest to the input feature state $|f\rangle$. This is an
extension of the pattern matching problem into the quantum domain. Assuming that these states 
are known a priori to belong to a certain parametric family of pure qubit systems, we derive two 
kinds of matching strategies. The first is a semiclassical strategy which is obtained by the natural 
extension of conventional matching strategies and consists of a two-stage procedure: identification 
(estimation) of the unknown template states to design the classifier (learning process to train the 
classifier) and classification of the input system into the appropriate pattern class based on the 
estimated results. The other is a fully quantum strategy without any intermediate measurement,
which we might call the universal quantum matching machine. We present the Bayes optimal
solutions for both strategies in the case of K = 1, showing that there certainly exists a fully quantum 
matching procedure which is strictly superior to the straightforward semiclassical extension of the 
conventional matching strategy based on the learning process. 

PACS numbers: 03.67.-a, 03.65.Ta, 89.70.+c



I. INTRODUCTION 



Distinguishing quantum systems is one of the central tasks in quantum information theory. We have a useful 
formalism known as quantum detection and estimation theory for dealing with this problem [1, 2, 3]. Recent progress
in quantum communication and computation provides motivation to generalize this theory and apply it to various
new situations. Depending on our purposes there may be various scenarios in the problem of distinguishing quantum
systems. The systems to be distinguished can sometimes be a set of given quantum states, and sometimes a set of
possible quantum dynamics. These systems are usually generated by a quantum source which is expected to have
certain characteristic features. If the source generates completely random phenomena, then it is impossible to
extract any meaningful information from it and therefore such a case will not come into our consideration. In a broad 
sense, we may then essentially have three possible circumstances: 

• (1) the source identity, i.e., a set of possible quantum systems and the associated probability distribution, is
completely known; 

• (2) the source identity is unknown, but it belongs to a parametrized family of quantum systems and probability 
distributions; 

• (3) the source is known to be stationary and ergodic, but no other information is available. 

Case (1) has long been a main subject of quantum detection and estimation theory. However, the other two cases
are becoming of practical importance in quantum information technology. Suppose, for example, we are interested
in finding efficient representations of incoming random sequences of quantum states. If the source identity is completely
known, then we have well-known theorems on the asymptotic average length of codewords, and efficient coding
algorithms are being developed and will be of practical use in the near future.

Consider then the situation in which the source identity is not completely known, which is indeed the case when 
dealing with a realistic quantum source. The obvious way to proceed would be by direct estimation of the source 
identity, which is then used in the coding algorithm in place of the unknown information of the source. When the 
source is known to be a member of a parametric family then the unknown parameters are readily estimated from the 
incoming training data. With enough data the estimate will be sufficiently close to the truth and the representation
will be nearly optimal. On the other hand, if only a limited number of training data are available, one has to consider
an appropriate estimation strategy which would ideally be not only asymptotically optimal when the length of the
training set tends to infinity, but also optimal for intermediate amounts of training data. This kind of problem
is known as learning strategy in conventional information theory, in particular in pattern matching theory ^|, |). 
Reasonable criteria which are usually assumed for a good strategy are: 

• (i) no knowledge of the source is required; 

• (ii) the delay due to the learning process is not long; 

• (iii) the strategy should be simple and easy to implement. 

The purpose of this paper is to develop a formalism for the quantum learning strategy and to apply it to the problem
of distinguishing quantum systems in cases (2) and (3). In a recent paper [8], the authors considered the problem
of quantum pattern matching, in which each pattern class $C_i$ is represented by a known quantum state $|g_i\rangle$ called
a template state, and the task is to find a template which optimally matches a given unknown quantum state $|f\rangle$.
Namely, we assumed that the input states $|f\rangle$ are given as quantum information (i.e., unknown quantum states)
whereas the template states $|g_i\rangle$ with known identities are given as classical information. Our goal was to obtain the
best template as classical information (i.e., knowledge of the identity of the best $|g_i\rangle$) via a suitable matching strategy
which is represented by a probability operator measure (POM), also referred to as a positive operator valued measure
(POVM).

In the present paper we relax the assumptions of our previous formulation in the following way. Instead
of fully knowing the identities of the template states, we may be given only some finite number ($K$) of copies of
each template (so our original formulation is equivalent to $K = \infty$). One matching strategy would then be to apply
state estimation to the sets of $K$ copies and proceed as in our original formulation with the resulting estimated
state identities. But this is unlikely to be an optimal strategy, since any intermediate measurement process generally
degrades the classification performance, as shown in Ref. [8]. Following the criteria (i)-(iii), we should consider
a more fully quantum procedure which, for any input $|f\rangle$, identifies the best template class without attempting to
obtain any further information about the identities of the template states themselves.

Unfortunately, it still seems difficult to deal with this problem in general contexts. Therefore, we mainly
consider here some tractable cases in order to illustrate how the quantum matching strategy should work in general.
In particular, we assume that we know a priori that the input feature state $|f\rangle$ and the template states $|g_1\rangle$ and $|g_2\rangle$
belong to the following parametrized families of pure quantum states:

$$ |f\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + e^{if}|\downarrow\rangle\right), \tag{1} $$

$$ |g_1\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + e^{ig_1}|\downarrow\rangle\right), \tag{2} $$

$$ |g_2\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + e^{ig_2}|\downarrow\rangle\right), \tag{3} $$

where the parameters $f$, $g_1$, and $g_2$ are completely unknown. In this model, we will compare the semiclassical
matching strategy, which is obtained by a natural extension of the conventional matching strategy, and its fully
quantum counterpart, which we will identify as the universal quantum matching machine.
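As a concrete reference point for this model, the following minimal Python sketch builds the equatorial qubit states of Eqs. (1)-(3) and evaluates the fidelity $|\langle f|g_i\rangle|^2 = [1 + \cos(f - g_i)]/2$ used as the matching criterion. It treats the phases as known classical numbers, which is precisely the information that is not available in the quantum problem; the function names are illustrative assumptions and do not appear in the paper.

```python
import cmath
import math

def equatorial_state(phase):
    """Amplitudes of (|up> + e^{i*phase}|down>)/sqrt(2), as in Eqs. (1)-(3)."""
    return [1 / math.sqrt(2), cmath.exp(1j * phase) / math.sqrt(2)]

def fidelity(a, b):
    """|<a|b>|^2 for two state vectors given as lists of amplitudes."""
    overlap = sum(complex(x).conjugate() * y for x, y in zip(a, b))
    return abs(overlap) ** 2

def closest_template(f, g1, g2):
    """Return 1 or 2, the template with the larger fidelity to the input.

    Hypothetical helper: it uses the phases as if they were known
    classically, so it is only a baseline, not a quantum strategy.
    """
    s1 = fidelity(equatorial_state(f), equatorial_state(g1))
    s2 = fidelity(equatorial_state(f), equatorial_state(g2))
    return 1 if s1 >= s2 else 2
```

For instance, `closest_template(0.0, 0.1, 3.0)` selects the first template, since its phase is much closer to that of the input.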

II. SEMICLASSICAL MATCHING MACHINE 

We are now given only some finite number ($K$) of identical samples of each template $|g_i\rangle$, which represents the
features of a class $C_i$ ($i = 1, \ldots, M$), but whose state identity is completely unknown. The input state $|f\rangle$ is also given
as an unknown quantum state and we have $N$ identical copies of $|f\rangle$. For simplicity we set $M = 2$, i.e., we study the
problem of binary classification. Thus we start with a system described by the state

$$ |\Psi\rangle = |f\rangle^{\otimes N} \otimes |g_1\rangle^{\otimes K} \otimes |g_2\rangle^{\otimes K}. \tag{4} $$

We first analyze a semiclassical strategy which is a natural extension of conventional matching strategies. That is,
we first apply state estimation to the template states, design a classifier based on the results, and then apply this to
measure and classify the input feature state (see Fig. 1). This strategy is represented by two kinds of POMs. The
first is for estimating the identities of the given unknown template states from the sets of $K$ samples $\hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K}$ ($\hat g_i$ is
understood as $|g_i\rangle\langle g_i|$). This POM is indexed by the possible outcomes $\{g_1', g_2'\}$ about the template identities and is
denoted by $\{\hat\mu(g_1', g_2')\}$. The other is for classifying the input feature state with $N$ samples $\hat f^{\otimes N}$. This POM consists
of two elements $\{\hat\Omega_1(g_1', g_2'), \hat\Omega_2(g_1', g_2')\}$ and should be the optimal matching strategy for the estimated templates
$\{g_1', g_2'\}$, which was already given in our previous paper [8]. In this way each classifier POM element depends on the
estimated parameters $\{g_1', g_2'\}$.

FIG. 1: The semiclassical matching strategy. The POM $\{\hat\mu(g_1', g_2')\}$ is for estimating the unknown template states. The output
$\{g_1', g_2'\}$ is used to design the classifier POM. In other words, using the training data $\hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K}$, we fix the classifier to learn
the appropriate template parameters.

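The problem here is then to find the optimal estimation strategy $\{\hat\mu(g_1', g_2')\}$. Such a strategy should maximize the
following average score:

$$ S^{sc} = \sum_{(g_1', g_2')} \sum_{j=1}^{2} \frac{1}{(2\pi)^3} \int_0^{2\pi}\!\!\int_0^{2\pi}\!\!\int_0^{2\pi} dg_1\, dg_2\, df\; \mathrm{Tr}\!\left[\hat\Omega_j(g_1', g_2')\, \hat f^{\otimes N}\right] \mathrm{Tr}\!\left[\hat\mu(g_1', g_2')\, \hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K}\right] |\langle f|g_j\rangle|^2. \tag{5} $$

The second trace term in Eq. (5) is the conditional probability of having the outcomes $\{g_1', g_2'\}$ for the template states
$\{\hat g_1^{\otimes K}, \hat g_2^{\otimes K}\}$. The first trace term is then the conditional probability that the input state $f$ is classified into the $j$-th
class when an appropriate matching strategy is applied to the $N$ identical input samples $\hat f^{\otimes N}$, and $|\langle f|g_j\rangle|^2$ is the
conditional score.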

Using the conventional terminology of pattern matching theory, the POM $\{\hat\mu(g_1', g_2')\}$ corresponds to the learning
process to train the classifier $\{\hat\Omega_j(g_1', g_2')\}$ with given training samples $\{\hat g_1^{\otimes K}, \hat g_2^{\otimes K}\}$. A well-known method is the
adaptive learning algorithm, in which one first measures each pair of the training samples $\{\hat g_1, \hat g_2\}$ and then updates
the classifier parameters step by step for $K$ iterations under some appropriate updating rules. In contrast, the optimal
learning strategy is expected to be a POM $\{\hat\mu(g_1', g_2')\}$ acting collectively on the state $\hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K}$, i.e., fully exploiting
the power of quantum entanglement.

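The main purpose of this section is to develop a Bayesian formulation for the optimal learning strategy. First we
introduce the score operators

$$ \hat W(g_j) \equiv \frac{1}{2\pi} \int_0^{2\pi} df\; \hat f^{\otimes N}\, |\langle f|g_j\rangle|^2, \tag{6} $$

just as in our previous paper, and rewrite Eq. (5) as

$$ S^{sc} = \sum_{(g_1', g_2')} \frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} dg_1\, dg_2\; \mathrm{Tr}\!\left[\hat\mu(g_1', g_2')\, \hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K}\right] \sum_{j=1}^{2} \mathrm{Tr}\!\left[\hat\Omega_j(g_1', g_2')\, \hat W(g_j)\right]. \tag{7} $$

We then further introduce a learning score operator

$$ \hat G(g_1', g_2') \equiv \frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} dg_1\, dg_2\; \hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K} \sum_{j=1}^{2} \mathrm{Tr}\!\left[\hat\Omega_j(g_1', g_2')\, \hat W(g_j)\right], \tag{8} $$

and rewrite Eq. (7) as

$$ S^{sc} = \sum_{(g_1', g_2')} \mathrm{Tr}\!\left[\hat\mu(g_1', g_2')\, \hat G(g_1', g_2')\right]. \tag{9} $$

Thus the problem of finding the optimal learning strategy reduces to the estimation problem of the classifier parameters
$g_1'$ and $g_2'$ through the learning score operator $\hat G(g_1', g_2')$.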



Let us now proceed with the explicit calculation. We first need to evaluate $\sum_{j=1}^{2} \mathrm{Tr}[\hat\Omega_j(g_1', g_2')\, \hat W(g_j)]$. If the
score operator $\hat W(g_j)$ were replaced by $\hat W(g_j')$, then this quantity would be nothing but the average score appearing
in the quantum template matching problem discussed in our previous paper [8]. In our previous work, the set
$\{\hat\Omega_1, \hat\Omega_2\}$ was designed for the a priori known parameters $g_1$ and $g_2$ of the template states. On the other hand, the
POM $\{\hat\Omega_1(g_1', g_2'), \hat\Omega_2(g_1', g_2')\}$ here should be designed for the estimated parameters $\{g_1', g_2'\}$, while the score operators
correspond to the unknown template states $g_1$ or $g_2$.

By definition, the POM $\{\hat\Omega_1(g_1', g_2'), \hat\Omega_2(g_1', g_2')\}$ should maximize the average score for $\hat W(g_j')$ instead of $\hat W(g_j)$, i.e.,
we should maximize

$$ S' = \sum_{j=1}^{2} \mathrm{Tr}\!\left[\hat W(g_j')\, \hat\Omega_j(g_1', g_2')\right] = \frac{1}{2} + \mathrm{Tr}\!\left[\left(\hat W(g_1') - \hat W(g_2')\right) \hat\Omega_1(g_1', g_2')\right], \tag{10} $$

where the resolution of the identity $\hat\Omega_1(g_1', g_2') + \hat\Omega_2(g_1', g_2') = \hat I$ and $\mathrm{Tr}\,\hat W(g_2') = 1/2$ were used in the second equality. The POM element
$\hat\Omega_1(g_1', g_2')$ should then be taken to maximize $\mathrm{Tr}[(\hat W(g_1') - \hat W(g_2'))\, \hat\Omega_1(g_1', g_2')]$, that is, it should be the projection
onto the subspace corresponding to the positive eigenvalues of the operator $\hat W(g_1') - \hat W(g_2')$. The score operators
are built from the tensor product of $N$ identical copies of the input system, $|f\rangle^{\otimes N}$, and they are most appropriately
described on the $N + 1$ dimensional totally symmetric bosonic subspace $\mathcal{H}_b$ of $\mathcal{H}^{\otimes N}$, spanned by $\{|m\rangle\}$, the occupation
number basis for the $\downarrow$ component. The score operators can then be written in the form



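$$ \hat W(g_j') = \frac{1}{2^{N+1}} \left[ \sum_{m=0}^{N} \binom{N}{m} |m\rangle\langle m| + \frac{1}{2} \sum_{m=0}^{N-1} \sqrt{\binom{N}{m}\binom{N}{m+1}} \left( e^{i g_j'} |m+1\rangle\langle m| + e^{-i g_j'} |m\rangle\langle m+1| \right) \right]. \tag{11} $$

Therefore

$$ \hat W(g_1') - \hat W(g_2') = \frac{\sin\theta}{2^{N+1}} \sum_{m=0}^{N-1} \sqrt{\binom{N}{m}\binom{N}{m+1}} \left( e^{i(\Theta+\pi/2)} |m+1\rangle\langle m| + e^{-i(\Theta+\pi/2)} |m\rangle\langle m+1| \right), \tag{12} $$

where we have introduced $\Theta \equiv (g_1' + g_2')/2$ and $\theta \equiv (g_1' - g_2')/2$. Eq. (12) can also be rewritten as

$$ \Delta\hat W(\Theta + \tfrac{\pi}{2}, \theta) \equiv \hat W(g_1') - \hat W(g_2') = \hat V(\Theta + \tfrac{\pi}{2})\, \Delta\hat W(0, \theta)\, \hat V^\dagger(\Theta + \tfrac{\pi}{2}), \tag{13} $$

where

$$ \hat V(\Theta) \equiv \sum_{m=0}^{N} e^{i m \Theta} |m\rangle\langle m| \tag{14} $$

and

$$ \Delta\hat W(0, \theta) \equiv \frac{\sin\theta}{2^{N+1}} \sum_{m=0}^{N-1} \sqrt{\binom{N}{m}\binom{N}{m+1}} \left( |m+1\rangle\langle m| + |m\rangle\langle m+1| \right). \tag{15} $$

Let the spectral decomposition of $\Delta\hat W(0, \theta)$ be

$$ \Delta\hat W(0, \theta) = \sum_{m=0}^{N} \lambda_m |\lambda_m\rangle\langle\lambda_m|, \tag{16} $$

and introduce the POM

$$ \hat\Lambda_1 = \sum_{\lambda_m > 0} |\lambda_m\rangle\langle\lambda_m|, \qquad \hat\Lambda_2 = \sum_{\lambda_m < 0} |\lambda_m\rangle\langle\lambda_m|. \tag{17} $$

Note that the $\{|\lambda_m\rangle\}$, and hence the $\{\hat\Lambda_j\}$, do not depend on $\theta$, while the eigenvalues $\lambda_m$ are proportional to $\sin\theta$.
The optimal strategy for maximizing $S'$ (Eq. (10)) is then expressed by the POM

$$ \hat\Omega_j(g_1', g_2') = \hat\Omega_j(\Theta) = \hat V(\Theta + \tfrac{\pi}{2})\, \hat\Lambda_j\, \hat V^\dagger(\Theta + \tfrac{\pi}{2}). \tag{18} $$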
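The closed form of the score operator can be checked numerically. The sketch below, a minimal illustration under our own naming, builds $\hat W(g)$ in the occupation number basis from the binomial expression of Eq. (11) and compares it with a direct uniform Riemann sum of $\hat f^{\otimes N} |\langle f|g\rangle|^2$ over $f$. The quadrature (which is exact here because the integrand is a low-order trigonometric polynomial) and the nested-list matrix representation are assumptions made for illustration only.

```python
import cmath
import math

def score_operator(N, g):
    """W(g) of Eq. (11) as an (N+1)x(N+1) nested list; W[k][l] multiplies |k><l|."""
    W = [[0j] * (N + 1) for _ in range(N + 1)]
    for m in range(N + 1):
        W[m][m] = math.comb(N, m) / 2 ** (N + 1)
    for m in range(N):
        c = math.sqrt(math.comb(N, m) * math.comb(N, m + 1)) / 2 ** (N + 2)
        W[m + 1][m] = c * cmath.exp(1j * g)
        W[m][m + 1] = c * cmath.exp(-1j * g)
    return W

def score_operator_by_quadrature(N, g, steps=256):
    """Average of |f><f|^{(x)N} |<f|g>|^2 over f, by a uniform Riemann sum."""
    W = [[0j] * (N + 1) for _ in range(N + 1)]
    for s in range(steps):
        f = 2 * math.pi * s / steps
        # Amplitudes of |f>^{(x)N} in the symmetric occupation number basis.
        amp = [math.sqrt(math.comb(N, k) / 2 ** N) * cmath.exp(1j * k * f)
               for k in range(N + 1)]
        weight = (1 + math.cos(f - g)) / 2 / steps
        for k in range(N + 1):
            for l in range(N + 1):
                W[k][l] += weight * amp[k] * amp[l].conjugate()
    return W
```

Comparing `score_operator(3, 0.7)` with `score_operator_by_quadrature(3, 0.7)` entry by entry gives agreement to rounding error, and the trace of either matrix is 1/2, as used in Eq. (10).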
FIG. 2: The configuration of the template states in the Bloch sphere representation. The input feature state and the template
states lie on the great circle including the x and y axes in the Bloch sphere.



The parameter $\Theta$ represents the relative position of the pair of the estimated states $g_1'$ and $g_2'$ from the $x$-axis in
the Bloch sphere. This is the only parameter needed to specify the classifier, that is, the one to be learned from the
training samples $\{\hat g_1^{\otimes K}, \hat g_2^{\otimes K}\}$. The angle between $g_1'$ and $g_2'$ in the Bloch sphere, on the other hand, is irrelevant for
the design of the classifier. The state configuration is depicted in Fig. 2.
Using Eq. (18) we then obtain



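$$ \sum_{j=1}^{2} \mathrm{Tr}\!\left[\hat\Omega_j(\Theta)\, \hat W(g_j)\right] = \frac{1}{2} + \mathrm{Tr}\!\left\{\hat\Omega_1(\Theta) \left[\hat W(g_1) - \hat W(g_2)\right]\right\} = \frac{1}{2} + \left[\sin(g_1 - \Theta) - \sin(g_2 - \Theta)\right] R_N, \tag{19} $$

where

$$ R_N \equiv \frac{1}{2^{N+1}} \sum_{m=0}^{N-1} \sqrt{\binom{N}{m}\binom{N}{m+1}}\; \langle m|\hat\Lambda_1|m+1\rangle. \tag{20} $$

Therefore the learning score operator in Eq. (8) can also be rewritten as

$$ \hat G(\Theta) = \frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} dg_1\, dg_2\; \hat g_1^{\otimes K} \otimes \hat g_2^{\otimes K} \left\{ \frac{1}{2} + \left[\sin(g_1 - \Theta) - \sin(g_2 - \Theta)\right] R_N \right\}. \tag{21} $$

Then the average score of Eq. (9) finally reads

$$ S^{sc} = \sum_{\Theta} \mathrm{Tr}\!\left[\hat\mu(\Theta)\, \hat G(\Theta)\right]. \tag{22} $$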



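After the integration over $g_1$ and $g_2$, the learning score operator $\hat G(\Theta)$ is represented by

$$ \hat G(\Theta) = \frac{1}{2}\, \hat C \otimes \hat C + \frac{R_N}{2} \left[ \hat D(\Theta) \otimes \hat C - \hat C \otimes \hat D(\Theta) \right], \tag{23} $$

where

$$ \hat C \equiv \frac{1}{2^K} \sum_{k=0}^{K} \binom{K}{k} |k\rangle\langle k|, \tag{24} $$

$$ \hat D(\Theta) \equiv \frac{i}{2^K} \sum_{k=0}^{K-1} \sqrt{\binom{K}{k}\binom{K}{k+1}} \left( e^{i\Theta} |k+1\rangle\langle k| - e^{-i\Theta} |k\rangle\langle k+1| \right). \tag{25} $$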
The basis $\{|k\rangle\}$ is the symmetric bosonic basis for the system of $K$ identical copies of the template states $\hat g_1^{\otimes K}$ or $\hat g_2^{\otimes K}$.
Although we have not yet succeeded in deriving the optimal POM $\{\hat\mu(\Theta)\}$ maximizing the above score $S^{sc}$ for general
$K$, in order to show how the method works we present here as an example three different kinds of optimal learning
strategies for the case $K = 1$.
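The integrated form of the learning score operator can be checked against its defining integral for small $K$. The following Python sketch is a minimal illustration with our own function names: it builds $\hat C$, $\hat D(\Theta)$ and $\hat G(\Theta) = \tfrac12 \hat C \otimes \hat C + \tfrac{R_N}{2}[\hat D(\Theta) \otimes \hat C - \hat C \otimes \hat D(\Theta)]$ in the symmetric bases, and compares the result with a uniform Riemann sum of Eq. (21). Here `R` stands for $R_N$ and is treated as a free input parameter, which is an assumption of the sketch.

```python
import cmath
import math

def template_density(K, g):
    """|g><g|^{(x)K} in the (K+1)-dimensional symmetric occupation basis."""
    amp = [math.sqrt(math.comb(K, k) / 2 ** K) * cmath.exp(1j * k * g)
           for k in range(K + 1)]
    return [[a * b.conjugate() for b in amp] for a in amp]

def kron(A, B):
    """Kronecker product of two square nested-list matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def G_closed_form(K, theta, R):
    """G(Theta) from the operators C and D(Theta), with R playing the role of R_N."""
    n = K + 1
    C = [[math.comb(K, k) / 2 ** K if k == l else 0.0 for l in range(n)]
         for k in range(n)]
    D = [[0j] * n for _ in range(n)]
    for k in range(K):
        c = 1j * math.sqrt(math.comb(K, k) * math.comb(K, k + 1)) / 2 ** K
        D[k + 1][k] = c * cmath.exp(1j * theta)
        D[k][k + 1] = -c * cmath.exp(-1j * theta)
    CC, DC, CD = kron(C, C), kron(D, C), kron(C, D)
    size = n * n
    return [[0.5 * CC[i][j] + 0.5 * R * (DC[i][j] - CD[i][j])
             for j in range(size)] for i in range(size)]

def G_by_quadrature(K, theta, R, steps=16):
    """Eq. (21) evaluated by a uniform Riemann sum over g1 and g2."""
    size = (K + 1) ** 2
    G = [[0j] * size for _ in range(size)]
    for a in range(steps):
        g1 = 2 * math.pi * a / steps
        rho1 = template_density(K, g1)
        for b in range(steps):
            g2 = 2 * math.pi * b / steps
            w = (0.5 + (math.sin(g1 - theta) - math.sin(g2 - theta)) * R) / steps ** 2
            rho = kron(rho1, template_density(K, g2))
            for i in range(size):
                for j in range(size):
                    G[i][j] += w * rho[i][j]
    return G
```

For example, `G_closed_form(2, 0.4, 0.1)` and `G_by_quadrature(2, 0.4, 0.1)` agree entry by entry up to rounding error.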

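The first one is the group covariant continuous POM. First observe that the spectral decomposition of $\hat G(\Theta)$ is as
follows

$$ \hat G(\Theta) = \frac{1 + 2R_N}{8} |a_+(\Theta)\rangle\langle a_+(\Theta)| + \frac{1}{8} |T\rangle\langle T| + \frac{1}{8} |a_0(\Theta)\rangle\langle a_0(\Theta)| + \frac{1 - 2R_N}{8} |a_-(\Theta)\rangle\langle a_-(\Theta)|, \tag{26} $$

where we have introduced

$$ |a_+(\Theta)\rangle \equiv \frac{1}{2} \left( -e^{-i(\Theta + \pi/2)} |\uparrow\uparrow\rangle + \sqrt{2}\, |S\rangle + e^{i(\Theta + \pi/2)} |\downarrow\downarrow\rangle \right), \tag{27} $$

$$ |a_0(\Theta)\rangle \equiv \frac{1}{\sqrt{2}} \left( e^{-i(\Theta + \pi/2)} |\uparrow\uparrow\rangle + e^{i(\Theta + \pi/2)} |\downarrow\downarrow\rangle \right), \tag{28} $$

$$ |a_-(\Theta)\rangle \equiv \frac{1}{2} \left( e^{-i(\Theta + \pi/2)} |\uparrow\uparrow\rangle + \sqrt{2}\, |S\rangle - e^{i(\Theta + \pi/2)} |\downarrow\downarrow\rangle \right), \tag{29} $$

$$ |T\rangle \equiv \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle \right), \tag{30} $$

$$ |S\rangle \equiv \frac{1}{\sqrt{2}} \left( |\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle \right). \tag{31} $$

If we symbolically denote

$$ \hat G(\Theta) = \frac{1}{8}\, \tilde g(\Theta) \oplus \frac{1}{8} |T\rangle\langle T|, \tag{32} $$

the optimal POM can be written as

$$ \hat\mu(\Theta) = \tilde\mu(\Theta) \oplus |T\rangle\langle T|, \tag{33} $$

and the average score is given by

$$ S^{sc} = \frac{1}{8} + \frac{1}{8}\, \mathrm{Tr}\, \tilde\Gamma, \tag{34} $$

where

$$ \tilde\Gamma \equiv \frac{1}{2\pi} \int_0^{2\pi} d\Theta\; \tilde\mu(\Theta)\, \tilde g(\Theta). \tag{35} $$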

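So we would like to find the POM $\tilde\mu(\Theta)$ maximizing $\mathrm{Tr}\,\tilde\Gamma$. We can see that the square root measurement based on
the maximum eigenvalue state $|a_+(\Theta)\rangle$ is actually the optimal POM. In fact, using

$$ \hat A \equiv \frac{1}{2\pi} \int_0^{2\pi} d\Theta\; |a_+(\Theta)\rangle\langle a_+(\Theta)| = \frac{1}{4} |\uparrow\uparrow\rangle\langle\uparrow\uparrow| + \frac{1}{2} |S\rangle\langle S| + \frac{1}{4} |\downarrow\downarrow\rangle\langle\downarrow\downarrow|, \tag{36} $$

the square root measurement is constructed by

$$ \tilde\mu(\Theta) = |\mu(\Theta)\rangle\langle\mu(\Theta)|, \tag{37} $$

$$ |\mu(\Theta)\rangle \equiv \hat A^{-1/2} |a_+(\Theta)\rangle = -e^{-i(\Theta + \pi/2)} |\uparrow\uparrow\rangle + |S\rangle + e^{i(\Theta + \pi/2)} |\downarrow\downarrow\rangle. \tag{38} $$

We then have

$$ \tilde\Gamma = \left(1 + \sqrt{2} R_N\right) |\uparrow\uparrow\rangle\langle\uparrow\uparrow| + \left(1 + 2\sqrt{2} R_N\right) |S\rangle\langle S| + \left(1 + \sqrt{2} R_N\right) |\downarrow\downarrow\rangle\langle\downarrow\downarrow|. \tag{39} $$

It is almost straightforward to prove that $\tilde\Gamma - \tilde g(\Theta) \geq 0$ (its eigenvalues are proportional to 3, 1, and 0), and that
$[\tilde\Gamma - \tilde g(\Theta)]\, \tilde\mu(\Theta) = 0$. Thus the POM of Eq. (37) is optimal [1, 2, 3], and the maximum average score is

$$ S^{sc}_{\max} = \frac{1}{2} + \frac{R_N}{\sqrt{2}}. \tag{40} $$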
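The POM in Eq. (38) is group covariant as specified by

$$ |\mu(\Theta)\rangle = \hat V(\Theta)\, |\mu(0)\rangle, \tag{41} $$

$$ \hat V(\Theta) \equiv e^{-i\Theta} |\uparrow\uparrow\rangle\langle\uparrow\uparrow| + |S\rangle\langle S| + e^{i\Theta} |\downarrow\downarrow\rangle\langle\downarrow\downarrow|. \tag{42} $$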


The second optimal learning strategy is the discrete version of the above strategy. Actually there are many equivalent
discrete POMs attaining the same maximum average score $S^{sc}_{\max}$. The strategy requiring the minimum number of
outputs is the most useful in practice. This can be directly read from Eq. (37) as $\{\tilde\mu(0), \tilde\mu(2\pi/3), \tilde\mu(4\pi/3)\}$ (each weighted by 1/3, so that the elements resolve the identity on the three-dimensional subspace).

These two strategies have been derived from quantum estimation theory in the standard way, that is, by taking the
symmetry of the operator $\hat G(\Theta)$ into account. On the other hand, we may also derive another solution from intuitive
considerations in the following way. Since the parameters $g_1$ and $g_2$ specifying the template states are completely
unknown, the two template states are independent, i.e., there is no a priori correlation between them, and they are
just described by the product state $|g_1\rangle \otimes |g_2\rangle$. It might then be sensible to expect that there should exist an optimal
learning strategy based on separate measurements on each template state. Yet the relative direction between the
two measurements might have to be correlated for us to be able to choose the appropriate classifier $\{\hat\Omega_1(\Theta), \hat\Omega_2(\Theta)\}$. We may
then apply a von Neumann measurement on each template to learn about the state identity. Let us define the two
von Neumann measurements



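$$ |A_\pm\rangle \equiv \frac{1}{\sqrt{2}} \left( |\uparrow\rangle \pm |\downarrow\rangle \right), \tag{43} $$

$$ |B_\pm\rangle \equiv \frac{1}{\sqrt{2}} \left( |\uparrow\rangle \pm i\, |\downarrow\rangle \right). \tag{44} $$

We can then show that the four-output POM with the corresponding guesses for $\Theta$,

$$ \begin{aligned} |\mu(\Theta_0)\rangle &= |A_+\rangle \otimes |B_+\rangle, & \Theta_0 &= -3\pi/4, \\ |\mu(\Theta_1)\rangle &= |A_+\rangle \otimes |B_-\rangle, & \Theta_1 &= -\pi/4, \\ |\mu(\Theta_2)\rangle &= |A_-\rangle \otimes |B_+\rangle, & \Theta_2 &= 3\pi/4, \\ |\mu(\Theta_3)\rangle &= |A_-\rangle \otimes |B_-\rangle, & \Theta_3 &= \pi/4, \end{aligned} \tag{45} $$

is also an optimal learning strategy. Actually, it can be seen
that $\sum_{i=0}^{3} \mathrm{Tr}\left[ |\mu(\Theta_i)\rangle\langle\mu(\Theta_i)|\, \hat G(\Theta_i) \right]$ is just the maximum
average score $S^{sc}_{\max}$. Note that in this case, however, the POM of Eq. (45) is no longer group covariant.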
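The claim that a separable measurement reaches the same score as the covariant one can be checked numerically for $K = 1$. The sketch below is a minimal illustration under our own naming: it evaluates the learning score operator of Eq. (21) by a uniform Riemann sum (an assumption of the sketch, exact here for low-order trigonometric integrands), and then computes the score of (a) the three-outcome discretization of the covariant POM at the angles $0, 2\pi/3, 4\pi/3$ with weight 1/3, completed by the projector onto $|T\rangle$, and (b) the four-outcome product measurement in the $x$ basis on the first template and the $y$ basis on the second. The symbol `R` stands for $R_N$ and is taken as a free input.

```python
import cmath
import math

SQ2 = math.sqrt(2)

def qubit(phase):
    """(|up> + e^{i*phase}|down>)/sqrt(2) as [amp_up, amp_down]."""
    return [1 / SQ2, cmath.exp(1j * phase) / SQ2]

def kron(a, b):
    """Two-qubit amplitudes ordered up-up, up-down, down-up, down-down."""
    return [x * y for x in a for y in b]

def expectation(state, op):
    """Real part of <state|op|state>."""
    n = len(state)
    return sum(complex(state[i]).conjugate() * op[i][j] * state[j]
               for i in range(n) for j in range(n)).real

def learning_score_operator(theta, R, steps=16):
    """G(Theta) of Eq. (21) for K = 1, by a Riemann sum over g1 and g2."""
    G = [[0j] * 4 for _ in range(4)]
    for a in range(steps):
        for b in range(steps):
            g1, g2 = 2 * math.pi * a / steps, 2 * math.pi * b / steps
            psi = kron(qubit(g1), qubit(g2))
            w = (0.5 + (math.sin(g1 - theta) - math.sin(g2 - theta)) * R) / steps ** 2
            for i in range(4):
                for j in range(4):
                    G[i][j] += w * psi[i] * complex(psi[j]).conjugate()
    return G

def covariant_state(theta):
    """Unnormalized vector -e^{-i phi}|uu> + |S> + e^{i phi}|dd>, phi = theta + pi/2."""
    phi = theta + math.pi / 2
    return [-cmath.exp(-1j * phi), 1 / SQ2, -1 / SQ2, cmath.exp(1j * phi)]

def three_outcome_score(R):
    """Score of the discretized covariant POM plus the |T><T| element."""
    triplet = [0, 1 / SQ2, 1 / SQ2, 0]
    angles = [2 * math.pi * j / 3 for j in range(3)]
    score = sum(expectation(covariant_state(t), learning_score_operator(t, R)) / 3
                for t in angles)
    return score + expectation(triplet, learning_score_operator(0.0, R))

def separable_score(R):
    """Score of the product measurement {A+-} (x) {B+-} with its four guesses."""
    A = {+1: [1 / SQ2, 1 / SQ2], -1: [1 / SQ2, -1 / SQ2]}
    B = {+1: [1 / SQ2, 1j / SQ2], -1: [1 / SQ2, -1j / SQ2]}
    guesses = {(+1, +1): -3 * math.pi / 4, (+1, -1): -math.pi / 4,
               (-1, +1): 3 * math.pi / 4, (-1, -1): math.pi / 4}
    return sum(expectation(kron(A[a], B[b]), learning_score_operator(t, R))
               for (a, b), t in guesses.items())
```

Both `three_outcome_score(R)` and `separable_score(R)` evaluate to $1/2 + R/\sqrt{2}$ up to rounding error, which is the value quoted for the maximum average score with $R = R_N$.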



III. UNIVERSAL QUANTUM MATCHING MACHINE 

The strategies described in the previous section would be good and practical matching strategies. But they are not
optimal, and there is a more fully quantum procedure which extracts only the required information, i.e., the classical
information on which class is best matched with $|f\rangle$, without attempting to obtain any further information about the
identities of the template states themselves. The total system at hand is now represented by the state

$$ |\Psi\rangle = |f\rangle^{\otimes N} \otimes |g_1\rangle^{\otimes K} \otimes \cdots \otimes |g_M\rangle^{\otimes K}. \tag{46} $$

The optimal strategy can then be defined by a straightforward extension of the Bayesian formulation given in our
previous work [8], with the score operators now defined by

$$ \hat W_i \equiv \left(\frac{1}{2\pi}\right)^{M} \int dg_1 \cdots \int dg_M \int df\; |\Psi\rangle\langle\Psi| \times |\langle f|g_i\rangle|^2\, P(f), \tag{47} $$

where $P(f)$ is the a priori probability distribution of the input feature parameter (taken here as uniform, i.e., $P(f) =
1/2\pi$). The new ingredients in the present formulation are just the additional integrations over the unknown parameters
for the template states. The fully quantum optimal strategy is obtained as a POM $\{\hat\Pi_i\}$ that maximizes the following
average score

$$ S^{QM} = \sum_{i=1}^{M} \mathrm{Tr}\left(\hat W_i \hat\Pi_i\right). \tag{48} $$

Once parametrized families of input and template states are specified, the obtained solution is expected to work
equally well for any states belonging to such families by its definition. In this sense we might call this optimal POM
a universal quantum matching machine.
A. Example: Two state system with M = 2 and K = 1



Although the definition of the universal quantum matching machine is straightforward, it is in general a difficult task
to derive an explicit expression for the corresponding POM. Here we present an illustrative example to demonstrate
how the universal quantum matching machine works and attains a performance which cannot be reached by any
conventional (semiclassical) matching strategy.

As before, the full input system $|f\rangle^{\otimes N}$ is most appropriately described on the $N + 1$ dimensional totally
symmetric bosonic subspace $\mathcal{H}_b$ as

$$ |f\rangle^{\otimes N} = \frac{1}{\sqrt{2^N}} \sum_{k=0}^{N} \sqrt{\binom{N}{k}}\; e^{ikf}\, |k\rangle, \tag{49} $$



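where $\{|k\rangle\}$ is the occupation number basis of the $\downarrow$-component. In the case of a binary matching problem ($M = 2$)
we have the two score operators

$$ \hat W_1 = \frac{1}{2^{N+2+2K}} \Bigg[ 2 \sum_{k=0}^{N} \sum_{m=0}^{K} \sum_{n=0}^{K} \binom{N}{k}\binom{K}{m}\binom{K}{n} |k,m,n\rangle\langle k,m,n| + \sum_{k=0}^{N-1} \sum_{m=0}^{K-1} \sum_{n=0}^{K} \sqrt{\binom{N}{k}\binom{N}{k+1}} \sqrt{\binom{K}{m}\binom{K}{m+1}} \binom{K}{n} \Big( |k+1,m,n\rangle\langle k,m+1,n| + |k,m+1,n\rangle\langle k+1,m,n| \Big) \Bigg], \tag{50} $$

$$ \hat W_2 = \frac{1}{2^{N+2+2K}} \Bigg[ 2 \sum_{k=0}^{N} \sum_{m=0}^{K} \sum_{n=0}^{K} \binom{N}{k}\binom{K}{m}\binom{K}{n} |k,m,n\rangle\langle k,m,n| + \sum_{k=0}^{N-1} \sum_{m=0}^{K} \sum_{n=0}^{K-1} \sqrt{\binom{N}{k}\binom{N}{k+1}} \binom{K}{m} \sqrt{\binom{K}{n}\binom{K}{n+1}} \Big( |k+1,m,n\rangle\langle k,m,n+1| + |k,m,n+1\rangle\langle k+1,m,n| \Big) \Bigg], \tag{51} $$

where $|k,m,n\rangle = |k\rangle \otimes |m\rangle \otimes |n\rangle$ and $\{|k\rangle\}$, $\{|m\rangle\}$, and $\{|n\rangle\}$ are the occupation number bases of the $\downarrow$-component
for $|f\rangle^{\otimes N}$, $|g_1\rangle^{\otimes K}$, and $|g_2\rangle^{\otimes K}$, respectively. We are to maximize the following quantity

$$ S^{QM} = \frac{1}{2} + \mathrm{Tr}\left[ \left(\hat W_1 - \hat W_2\right) \hat\Pi_1 \right]. \tag{52} $$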



As already explained in Section II below Eq. (10), the problem reduces to finding the subspace corresponding to the
positive eigenvalues of the operator $\hat W_1 - \hat W_2$. From Eqs. (50) and (51) we have in the case of $K = 1$ that

$$ \hat W_1 - \hat W_2 = \frac{\sqrt{2}}{2^{N+4}} \sum_{k=0}^{N-1} \sqrt{\binom{N}{k}\binom{N}{k+1}} \Big( -|k+1,00\rangle\langle k,S| - |k,S\rangle\langle k+1,00| + |k+1,S\rangle\langle k,11| + |k,11\rangle\langle k+1,S| \Big), \tag{53} $$

where the state $|k+1,00\rangle$ is understood as $|k+1\rangle \otimes |0\rangle \otimes |0\rangle$. (Note that in the $K = 1$ case we have $|S\rangle =
(|01\rangle - |10\rangle)/\sqrt{2} = (|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)/\sqrt{2}$, i.e., the bosonic space for the template states reduces to the original one-qubit
space.) The operator $\hat W_1 - \hat W_2$ can finally be arranged into a direct sum as

$$ \hat W_1 - \hat W_2 = \bigoplus_{k=0}^{N} \Delta\hat W_k, \tag{54} $$
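where

$$ \Delta\hat W_k = \frac{\sqrt{2}}{2^{N+4}} \left[ -\sqrt{\binom{N}{k}\binom{N}{k+1}} \Big( |k+1,00\rangle\langle k,S| + |k,S\rangle\langle k+1,00| \Big) + \sqrt{\binom{N}{k}\binom{N}{k-1}} \Big( |k,S\rangle\langle k-1,11| + |k-1,11\rangle\langle k,S| \Big) \right] \quad (1 \leq k \leq N-1), \tag{55} $$

$$ \Delta\hat W_0 = \frac{\sqrt{2N}}{2^{N+4}} \Big( -|1,00\rangle\langle 0,S| - |0,S\rangle\langle 1,00| \Big), \tag{56} $$

and

$$ \Delta\hat W_N = \frac{\sqrt{2N}}{2^{N+4}} \Big( |N,S\rangle\langle N-1,11| + |N-1,11\rangle\langle N,S| \Big). \tag{57} $$

Subsequently, the $\Delta\hat W_k$'s are diagonalized as

$$ \Delta\hat W_k = \frac{\sqrt{2}}{2^{N+4}} \sqrt{\binom{N}{k} \left[ \binom{N}{k+1} + \binom{N}{k-1} \right]} \Big( |(k+1)_+\rangle\langle (k+1)_+| - |(k+1)_-\rangle\langle (k+1)_-| \Big) \quad (1 \leq k \leq N-1), \tag{58} $$

$$ \Delta\hat W_0 = \frac{\sqrt{2N}}{2^{N+4}} \Big( |1_+\rangle\langle 1_+| - |1_-\rangle\langle 1_-| \Big), \tag{59} $$

and

$$ \Delta\hat W_N = \frac{\sqrt{2N}}{2^{N+4}} \Big( |(N+1)_+\rangle\langle (N+1)_+| - |(N+1)_-\rangle\langle (N+1)_-| \Big), \tag{60} $$

where

$$ |(k+1)_\pm\rangle \equiv \frac{1}{\sqrt{2\left[\binom{N}{k+1} + \binom{N}{k-1}\right]}} \left[ \mp\sqrt{\binom{N}{k+1}}\, |k+1,00\rangle + \sqrt{\binom{N}{k+1} + \binom{N}{k-1}}\, |k,S\rangle \pm \sqrt{\binom{N}{k-1}}\, |k-1,11\rangle \right] \tag{61} $$

for $1 \leq k \leq N-1$,

$$ |1_\pm\rangle \equiv \frac{1}{\sqrt{2}} \Big( \mp|1,00\rangle + |0,S\rangle \Big), \tag{62} $$

and

$$ |(N+1)_\pm\rangle \equiv \frac{1}{\sqrt{2}} \Big( \pm|N,S\rangle + |N-1,11\rangle \Big), \tag{63} $$

respectively. Therefore the optimal matching strategy is described by the POM

$$ \hat\Pi_1 = \sum_{k=1}^{N+1} |k_+\rangle\langle k_+|, \qquad \hat\Pi_2 = \sum_{k=1}^{N+1} |k_-\rangle\langle k_-|, \tag{64} $$

where the null space of $\hat W_1 - \hat W_2$ does not contribute to the score and may be assigned to either element so that $\hat\Pi_1 + \hat\Pi_2 = \hat I$,
and the optimal attainable average score is given by

$$ S^{QM}_{\max} = \frac{1}{2} + \frac{\sqrt{2}}{2^{N+4}} \left[ 2\sqrt{N} + \sum_{k=1}^{N-1} \sqrt{\binom{N}{k} \left[ \binom{N}{k+1} + \binom{N}{k-1} \right]} \right]. \tag{65} $$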



This score obtained by the universal quantum matching machine should be compared with the optimal score
$S^{sc}_{\max}$ of the semiclassical matching strategy based on the learning process. Fig. 3 shows the average score of the two
kinds of matching strategies as a function of $N$, the number of input feature samples. The big dots represent the
average score of the universal quantum matching machine in the case of $K = 1$. This is larger than the one of the
optimal semiclassical matching strategy based on the learning process, shown by the big circles. As $K$ increases, we
expect a larger score, although the values cannot be plotted because we have not yet succeeded in deriving the optimal
solution for general $K$. For $K = \infty$, we have derived the maximum attainable score in [8], which is shown by the
solid line in Fig. 3. The dashed line is for the semiclassical matching by majority voting.
FIG. 3: The average score as a function of the number of input feature systems. The big dots and circles represent the
attainable scores by the universal quantum matching machine and the optimal semiclassical matching strategy based on the
learning process, respectively, in the case $K = 1$. The solid and dashed lines are the scores in the case of $K = \infty$ derived in
our previous paper.
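The comparison of the two $K = 1$ scores can be reproduced numerically along the following lines. This minimal Python sketch rests on two stated assumptions: the fully quantum score is evaluated from the closed form $S^{QM}_{\max} = \tfrac12 + \tfrac{\sqrt{2}}{2^{N+4}}\big[2\sqrt{N} + \sum_{k=1}^{N-1} \sqrt{\binom{N}{k}[\binom{N}{k+1} + \binom{N}{k-1}]}\big]$, and the semiclassical score from $S^{sc}_{\max} = \tfrac12 + R_N/\sqrt{2}$, with $R_N$ computed as the sum of the positive eigenvalues of the tridiagonal matrix with off-diagonal entries $\sqrt{\binom{N}{m}\binom{N}{m+1}}$, divided by $2^{N+2}$ (this is equivalent to Eq. (20) because $\hat\Lambda_1$ projects onto the positive eigenspace of that matrix). The small Jacobi eigenvalue routine is our own helper, included only to keep the example free of external libraries.

```python
import math

def symmetric_eigenvalues(A, sweeps=100):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    norm2 = sum(x * x for row in A for x in row)
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= 1e-24 * norm2:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                if A[p][p] == A[q][q]:
                    t = math.pi / 4
                else:
                    t = 0.5 * math.atan(2 * A[p][q] / (A[q][q] - A[p][p]))
                c, s = math.cos(t), math.sin(t)
                for k in range(n):   # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):   # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [A[i][i] for i in range(n)]

def semiclassical_score(N):
    """1/2 + R_N/sqrt(2) for K = 1, with R_N from the positive spectrum."""
    M = [[0.0] * (N + 1) for _ in range(N + 1)]
    for m in range(N):
        M[m][m + 1] = M[m + 1][m] = math.sqrt(math.comb(N, m) * math.comb(N, m + 1))
    positive = sum(x for x in symmetric_eigenvalues(M) if x > 0)
    return 0.5 + positive / 2 ** (N + 2) / math.sqrt(2)

def uqmm_score(N):
    """Closed-form K = 1 score of the universal quantum matching machine."""
    total = 2 * math.sqrt(N)
    for k in range(1, N):
        total += math.sqrt(math.comb(N, k) * (math.comb(N, k + 1) + math.comb(N, k - 1)))
    return 0.5 + math.sqrt(2) / 2 ** (N + 4) * total

if __name__ == "__main__":
    for N in range(1, 9):
        print(N, round(semiclassical_score(N), 6), round(uqmm_score(N), 6))
```

Under these expressions the two scores coincide at $N = 1$, while from $N = 2$ onward the fully quantum score is the larger of the two.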



IV. CONCLUDING REMARKS 



We have considered a fully quantum extension of the binary quantum pattern matching problem which was addressed
in the recent paper [8]. In such a problem, given unknown template states $|g_1\rangle^{\otimes K}$ and $|g_2\rangle^{\otimes K}$ and an input feature
state $|f\rangle^{\otimes N}$, we are to decide to which template the input feature state is closest in the sense of the fidelity criterion.
We have presented two kinds of matching strategies, that is, a semiclassical matching strategy based on learning and
a universal quantum matching strategy. In particular, we have explicitly derived the Bayes optimal learning strategy
for the semiclassical matching and the optimal universal quantum matching strategy in the case of one copy, $K = 1$,
for the template states, and an arbitrary number $N$ of copies of the input state. Our previous results in [8] correspond
to the case of $K = \infty$.

For general $K \geq 2$, the Bayes optimal solutions for both the semiclassical learning strategy and the universal
quantum matching strategy are still not known. Concerning the optimal learning strategy used in the semiclassical
matching problem, one of the interesting questions would be whether there exists an optimal separable strategy of the
type described in Eq. (45). From a preliminary analysis of the case of $K = 2$, the POM similar to Eq. (45), which
is now the product of two 3-output von Neumann measurements, does not satisfy the Bayes optimal condition
for state estimation. What would the optimal learning strategy then look like in this case? Of course there should
be a group covariant POM which is generally an entangled measurement on the two templates. Is such an entangled
measurement the only optimal learning strategy? If so, it would be surprising because the two templates have no
a priori correlation. Or are there other kinds of separable measurements? As for the universal quantum matching
machine, the problem would just reduce to finding the appropriate division of the Hilbert space, but for larger $K$ this
becomes a tedious task.

The reader might feel that the model used in this paper is in some respects artificial. In fact, this model is
still far away from practically encountered situations. However, we may say that an important aspect of the quantum
pattern matching problem is already seen. Namely, there certainly exists a fully quantum matching procedure, the
universal quantum matching machine, which is strictly superior to the straightforward extension of the conventional
matching strategy based on the learning process of the classifier with the training template samples. The derived
universal quantum matching machine, i.e., the POM in Eq. (64), provides a typical matching model for extracting the
meaningful information about the best template class without attempting to obtain any further information about the
identities of the template states themselves, excluding any intermediate measurement process. In a similar spirit, it
is worth mentioning the recent work on the comparison of two unknown quantum pure states |J , where the quantum 
optimal comparing strategies are derived for several criteria. 
In practical applications, the input and the template systems will be more complicated, and possibly associated
with secondary features which are not relevant to the pattern classification. So, as already pointed out in [8], it would
be of practical concern how to enhance the features of interest and to quarry the essential components (subspaces)
of the quantum system for the pattern classification. In the scenario where the input and template identities are
completely unknown, we might rely on a two-stage procedure: first estimate the input and template identities to
extract important features by using some set of asymptotically vanishing measure of the given samples; then discard
redundant parts of the input and the template systems, and cut an effective subspace out of the original quantum
Hilbert space; finally, after the feature enhancement process, carry out a fully quantum pattern classification procedure
in the smaller space. Thus, in a sense, we see that the quantum pattern matching problem naturally involves aspects
of both state estimation and state discrimination. The former is necessary for the learning process and the feature
enhancement, while the latter is used for the pattern classification. In this direction, it would also be interesting
to study effective quantum matching algorithms which are simple enough in structure and easy to implement,
although not necessarily Bayes optimal.

Similarly to the case of the conventional pattern matching problem, the complexity of quantum matching algorithms
will be an important future problem. It is in fact believed that the complexity of some image recognition problems
is in the NP-complete class. How can the quantum pattern matching problem be treated from the point of view of
quantum computational complexity? If there is some progress in the synthesis of a quantum network for the
optimal POM obtained in the Bayesian approach, then it will be possible to search for near-optimal quantum matching
algorithms whose complexity might eventually be lower than that of the corresponding conventional semiclassical
approaches.